Hierarchical Models for Indirect Observation in Botnet Population Estimation: A Case Study Using Conficker-C
Abstract
This proposal describes a model for the hourly number of peer-to-peer connection requests generated by a single machine in the Conficker-C botnet, as seen by an observer who monitors a fixed proportion of Internet address space. The goal of this peer-to-peer behavioral model is to estimate the number of infected machines in the botnet when the machines cannot be observed directly. Network behavior is often observed through the filter of IP addresses, and the mapping between machines and addresses is not one-to-one. Because all machines in the botnet are infected with the same malicious code base, their malicious behavior can serve as a more stable proxy for a single machine than an IP address. Parameters for a single infected machine can be estimated using informed priors and MCMC methods, with Metropolis-within-Gibbs sampling to account for time-dependent heterogeneity in connection rates. For population estimation under indirect observation, we introduce a confusion matrix that can represent the different mappings of machines to IP addresses commonly used in network infrastructure. We propose MCMC methods for estimating single-host parameters and population hyperparameters, and for estimating population size across a large set of independent networks. Sections 1.1 and 1.2 motivate the population study and give an overview of the main challenges in applying traditional population estimation methodology to network phenomena. Section 2 introduces the Conficker-C population and the methods for observing it in more detail. Section 3 describes the model for observing a single infected machine and introduces the framework for extending the single-host model to a network model. Section 4 describes data collection and proposes the steps to be taken to complete the dissertation.
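To make the phrase "Metropolis-within-Gibbs" concrete, the following is a minimal sketch, not the model developed in the proposal. It assumes a simple Gamma-Poisson hierarchy chosen only for illustration: the observed hourly count y_t is Poisson with mean delta * lam_t, where delta is the monitored proportion of address space, and the hourly rates lam_t are Gamma(a, rate b). The rates and b have conjugate conditionals and are updated with Gibbs steps; the shape a does not, so it gets a Metropolis step. The priors, the function names, the value of delta, and the example counts are all hypothetical.

```python
import math
import random


def log_cond_shape(a, b, lams):
    """Log conditional density of the Gamma shape a, up to a constant.

    Assumes lam_t ~ Gamma(a, rate b) and an Exp(1) prior on a (an assumption).
    """
    n = len(lams)
    sum_log = sum(math.log(max(lam, 1e-300)) for lam in lams)
    return n * (a * math.log(b) - math.lgamma(a)) + (a - 1.0) * sum_log - a


def sample_posterior(y, delta, n_iter=2000, seed=0, c=1.0, d=1.0, step=0.3):
    """Metropolis-within-Gibbs for the illustrative hierarchy.

    y     : observed hourly connection counts (hypothetical data)
    delta : proportion of address space monitored
    c, d  : Gamma(c, rate d) prior on b (an assumption)
    Returns a list of (a, b, mean hourly rate) draws.
    """
    rng = random.Random(seed)
    n = len(y)
    a, b = 1.0, 1.0
    draws = []
    for _ in range(n_iter):
        # Gibbs step: lam_t | y_t, a, b ~ Gamma(a + y_t, rate b + delta).
        # random.gammavariate takes a scale, so pass 1 / rate.
        lams = [rng.gammavariate(a + y_t, 1.0 / (b + delta)) for y_t in y]
        # Gibbs step: b | lams, a ~ Gamma(c + n * a, rate d + sum(lams)).
        b = rng.gammavariate(c + n * a, 1.0 / (d + sum(lams)))
        # Metropolis step: random walk on log(a); the log(prop / a) term
        # is the Jacobian of the log-scale proposal.
        prop = a * math.exp(step * rng.gauss(0.0, 1.0))
        log_ratio = (log_cond_shape(prop, b, lams)
                     - log_cond_shape(a, b, lams)
                     + math.log(prop / a))
        if rng.random() < math.exp(min(0.0, log_ratio)):
            a = prop
        draws.append((a, b, sum(lams) / n))
    return draws


if __name__ == "__main__":
    # Hypothetical counts for 12 hours, observer monitoring 1% of address space.
    counts = [4, 6, 5, 3, 7, 5, 4, 6, 8, 2, 5, 5]
    chain = sample_posterior(counts, delta=0.01)
    kept = [m for _, _, m in chain[500:]]  # discard burn-in
    print("posterior mean hourly rate:", sum(kept) / len(kept))
```

The hourly rates lam_t stand in for the time-dependent heterogeneity mentioned in the abstract; a model of the actual Conficker-C scanning behavior would replace the Poisson and Gamma assumptions with distributions derived from the protocol.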
Similar resources
Cert Research Annual Report
Botnet size is often reported as a number of IP addresses, but the link between IP addresses and infected machines is more complicated than a simple one-to-one relationship. To count the number of infected machines when we have only an aggregated view of a botnet, we suggest building a precise probability model of the observable behavior of a single machine, and applying that model to the aggre...
A Probabilistic Population Study of the Conficker-C Botnet
We estimate the number of active machines per hour infected with the Conficker-C worm, using a probability model of Conficker-C’s UDP P2P scanning behavior. For an observer with access to a proportion δ of monitored IPv4 space, we derive the distribution of the number of times a single infected host is observed scanning the monitored space, based on a study of the P2P protocol, and on network a...
Post-Mortem of a Zombie: Conficker Cleanup After Six Years
Research on botnet mitigation has focused predominantly on methods to technically disrupt the command-and-control infrastructure. Much less is known about the effectiveness of large-scale efforts to clean up infected machines. We analyze longitudinal data from the sinkhole of Conficker, one of the largest botnets ever seen, to assess the impact of what has been emerging as a best practice: national...
Automated and Scalable QoS Control for Network Convergence
Brent Hoon (Brent ByungHoon Kang, UNC at Charlotte) said that although Conficker had been contained (by buying up the domain names to be used in C&C) in 2009, Conficker has never disappeared. They are still seeing hits, and there appear to be about six million IP addresses that show signs of infection. Stefan Savage wondered how you could economically get patches to six million people. Someone ...